[ET Device Support] PropagateDevicePass inserts H2D/D2H copy ops at delegate boundaries#18730
Gasoonjia wants to merge 2 commits into gh/gasoonjia/160/base from
…elegate boundaries

Extend PropagateDevicePass to insert explicit et_copy._h2d_copy and et_copy._d2h_copy ops at delegate boundaries, making the graph functional by explicitly transferring data between CPU and device memory.

Key changes:
- Inserts _h2d_copy before each delegate input, _d2h_copy after each output
- Original input nodes stay CPU; h2d_copy output tagged as device
- Getitem nodes inherit device; d2h_copy output tagged as CPU
- Skip-copy optimizations via skip_h2d_for_method_inputs/skip_d2h_for_method_outputs
- _parse_device_spec_value: lowercases string, raises ValueError for unknown types
- _program.py passes config flags to PropagateDevicePass constructor

Differential Revision: [D99636777](https://our.internmc.facebook.com/intern/diff/D99636777/)

[ghstack-poisoned]
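For reviewers, the `_parse_device_spec_value` behavior listed in this commit message (lowercase strings, `ValueError` for unknown types) could look roughly like the sketch below. This is not the code from the diff: the function name `parse_device_spec_value_sketch`, the acceptance of `bytes`, and the error wording are assumptions made for illustration.

```python
def parse_device_spec_value_sketch(value):
    """Hypothetical stand-in for _parse_device_spec_value (not the PR's code)."""
    # Assumption: compile-spec values may arrive as raw bytes; decode them first.
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    # Strings are normalized to lowercase so "CUDA" and "cuda" compare equal.
    if isinstance(value, str):
        return value.lower()
    # Anything else is rejected loudly instead of being silently ignored.
    raise ValueError(
        f"Unsupported device spec value type: {type(value).__name__}"
    )


normalized = parse_device_spec_value_sketch("CUDA")
```

Raising on unknown types, rather than falling back to CPU, surfaces a misconfigured spec at export time instead of as a silent placement change.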
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18730
Note: Links to docs will display an error until the docs builds have been completed.
❌ 65 New Failures
As of commit 4b48ed3 with merge base b5ae0b9: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…elegate boundaries
ghstack-source-id: 363318416
Pull Request resolved: #18730
digantdesai
left a comment
Review automatically exported from Phabricator review in Meta.
Stack from ghstack (oldest at bottom):
Extend PropagateDevicePass to insert explicit et_copy._h2d_copy and
et_copy._d2h_copy ops at delegate boundaries, making the graph functional
by explicitly transferring data between CPU and device memory.
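Key changes:
- Inserts _h2d_copy before each delegate input, _d2h_copy after each output
- Original input nodes stay CPU; h2d_copy output tagged as device
- Getitem nodes inherit device; d2h_copy output tagged as CPU
- Skip-copy optimizations via skip_h2d_for_method_inputs/skip_d2h_for_method_outputs
- _parse_device_spec_value: lowercases string, raises ValueError for unknown types
- _program.py passes config flags to PropagateDevicePass constructor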
Differential Revision: D99636777
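To make the intended graph shape concrete, here is a minimal sketch of boundary-copy insertion on a toy graph. It does not use `torch.fx` or the actual PropagateDevicePass code: the `Node` class, the `insert_boundary_copies` function, the `"cuda"` device string, and the exact meaning given to the two skip flags (skip H2D when the delegate input is a method input, skip D2H when the delegate output feeds the method output directly) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    # Toy graph node; op is one of: placeholder, delegate, getitem,
    # h2d_copy, d2h_copy, output. Not an FX node.
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)
    device: str = "cpu"


def insert_boundary_copies(
    nodes: List[Node],
    device: str = "cuda",  # assumed device name
    skip_h2d_for_method_inputs: bool = False,
    skip_d2h_for_method_outputs: bool = False,
) -> List[Node]:
    by_name: Dict[str, Node] = {n.name: n for n in nodes}
    # Names consumed directly by the method's output node.
    method_outputs = {i for n in nodes if n.op == "output" for i in n.inputs}
    renamed: Dict[str, str] = {}  # getitem name -> its d2h copy name
    out: List[Node] = []
    for node in nodes:
        if node.op == "delegate":
            new_inputs = []
            for src in node.inputs:
                is_method_input = by_name[src].op == "placeholder"
                src = renamed.get(src, src)
                if skip_h2d_for_method_inputs and is_method_input:
                    new_inputs.append(src)  # assumed: caller supplies device data
                    continue
                # The original input stays on CPU; only the copy is on device.
                copy = Node(f"{src}_h2d", "h2d_copy", [src], device)
                out.append(copy)
                new_inputs.append(copy.name)
            out.append(Node(node.name, "delegate", new_inputs, device))
        elif node.op == "getitem" and by_name[node.inputs[0]].op == "delegate":
            # Getitem nodes inherit the delegate's device.
            out.append(Node(node.name, "getitem", list(node.inputs), device))
            if skip_d2h_for_method_outputs and node.name in method_outputs:
                continue
            copy = Node(f"{node.name}_d2h", "d2h_copy", [node.name], "cpu")
            out.append(copy)
            renamed[node.name] = copy.name
        else:
            # Downstream consumers read the CPU copy, not the device tensor.
            rewired = [renamed.get(i, i) for i in node.inputs]
            out.append(Node(node.name, node.op, rewired, node.device))
    return out


graph = [
    Node("x", "placeholder"),
    Node("lowered", "delegate", ["x"]),
    Node("y", "getitem", ["lowered"]),
    Node("out", "output", ["y"]),
]
result = insert_boundary_copies(graph)
for n in result:
    print(n.name, n.op, n.inputs, n.device)
```

With both flags left at `False`, the toy graph gains one `h2d_copy` before the delegate and one `d2h_copy` after the getitem; with both set to `True`, this sketch leaves the graph shape unchanged and only retags the delegate and getitem nodes as device.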